Improved Learning of Chinese Word Embeddings with Semantic Knowledge
Abstract
While previous studies show that modeling the minimum meaning-bearing units (characters or morphemes) benefits learning vector representations of words, they ignore the semantic dependencies across these units when deriving word vectors. In this work, we propose to improve the learning of Chinese word embeddings by exploiting semantic knowledge. The basic idea is to take the semantic knowledge about words and their component characters into account when designing composition functions. Experiments show that our approach outperforms two strong baselines on word similarity, word analogy, and document classification tasks.
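As a rough illustration of what a knowledge-aware composition function could look like, the sketch below blends a word embedding with a weighted average of its character embeddings. It is only a minimal sketch under stated assumptions, not the method of the paper: the function name `compose_word_vector`, the per-character relevance scores `char_weights` (imagined here as coming from some external semantic resource), and the equal 0.5 mixing of word and character parts are all hypothetical.

```python
from typing import List, Sequence


def compose_word_vector(
    word_vec: Sequence[float],
    char_vecs: Sequence[Sequence[float]],
    char_weights: Sequence[float],
) -> List[float]:
    """Blend a word embedding with its character embeddings.

    char_weights are hypothetical semantic-relevance scores, one per
    character (an assumption for illustration). They are normalized to
    sum to 1 before the weighted average is taken.
    """
    total = sum(char_weights)
    if total <= 0:
        # No character is judged relevant: use the word vector alone.
        return list(word_vec)

    dim = len(word_vec)
    char_part = [0.0] * dim
    for vec, weight in zip(char_vecs, char_weights):
        for i in range(dim):
            char_part[i] += (weight / total) * vec[i]

    # Equal mixing of word and character information (an assumed choice).
    return [0.5 * (w + c) for w, c in zip(word_vec, char_part)]


if __name__ == "__main__":
    # Toy 2-dimensional vectors for a two-character word.
    vec = compose_word_vector(
        [1.0, 0.0],
        [[0.0, 1.0], [1.0, 1.0]],
        [1.0, 3.0],
    )
    print(vec)
```

With weights like these, a character judged semantically closer to the word contributes more to the composed vector, while a character with weight zero is ignored entirely.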
Similar resources
Improve Chinese Word Embeddings by Exploiting Internal Structure
Recently, researchers have demonstrated that both a Chinese word and its component characters provide rich semantic information when learning Chinese word embeddings. However, they ignored the semantic similarity across component characters in a word. In this paper, we learn the semantic contribution of characters to a word by exploiting the similarity between a word and its component characters ...
Learning Semantic Word Embeddings based on Ordinal Knowledge Constraints
In this paper, we propose a general framework to incorporate semantic knowledge into the popular data-driven learning process of word embeddings to improve their quality. Under this framework, we represent semantic knowledge as many ordinal ranking inequalities and formulate the learning of semantic word embeddings (SWE) as a constrained optimization problem, where the data-derived object...
Investigating Stroke-Level Information for Learning Chinese Word Embeddings
We propose a novel method for learning Chinese word embeddings. Unlike previous approaches, we investigate the effectiveness of Chinese stroke-level information when learning Chinese word embeddings. Empirically, our model consistently outperforms several state-of-the-art methods, including skip-gram, CBOW, GloVe and CWE, on the standard word similarity and word analogy tasks.
Improving Lexical Embeddings with Semantic Knowledge
Word embeddings learned on unlabeled data are a popular tool in semantics, but may not capture the desired semantics. We propose a new learning objective that incorporates both a neural language model objective (Mikolov et al., 2013) and prior knowledge from semantic resources to learn improved lexical semantic embeddings. We demonstrate that our embeddings improve over those learned solely on ...
Exploring Semantic Representation in Brain Activity Using Word Embeddings
In this paper, we utilize distributed word representations (i.e., word embeddings) to analyse the representation of semantics in brain activity. The brain activity data were recorded using functional magnetic resonance imaging (fMRI) when subjects were viewing words. First, we analysed the functional selectivity of different cortex areas by calculating the correlations between neural responses ...